Markerless human motion capture (MOCAP) from multiple RGB cameras is a widely studied problem. Existing methods either need calibrated cameras or calibrate them relative to a static camera, which acts as the reference frame for the MOCAP system. The calibration step has to be done a priori for every capture session, which is a tedious process, and re-calibration is required whenever cameras are intentionally or accidentally moved. In this paper, we propose a MOCAP method that uses multiple static and moving extrinsically uncalibrated RGB cameras. The key components of our method are as follows. First, since the cameras and the subject can move freely, we select the ground plane as a common reference to represent both the body and the camera motions, unlike existing methods that represent the body in camera coordinates. Second, we learn a probability distribution of short human motion sequences (~1 sec) relative to the ground plane and leverage it to disambiguate between camera and human motion. Third, we use this distribution as a motion prior in a novel multi-stage optimization approach that fits the SMPL human body model and the camera poses to the human body keypoints in the images. Finally, we demonstrate that our method works on a variety of datasets, ranging from aerial cameras to smartphones. It also gives more accurate results than methods for the task of monocular human MOCAP with a static camera. Our code is available for research purposes at https://github.com/robot-perception-group/smartmocap.
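As a rough illustration of the multi-stage optimization idea, the toy sketch below jointly optimizes per-frame camera positions and a body root trajectory in a shared ground-plane frame against 2D keypoint reprojection error, with a simple smoothness term standing in for the learned motion prior; the focal length, data, and translation-only cameras are all assumptions, and the actual method fits the full SMPL model:

```python
# Hypothetical sketch: joint optimization of camera positions and a body root
# trajectory in a common ground-plane frame, driven by 2D keypoint
# reprojection error plus a simple motion prior (a stand-in for the learned
# distribution over ~1 sec motion sequences described in the abstract).
import torch

T, C = 30, 3                      # frames, cameras
torch.manual_seed(0)

# Observed 2D keypoints (toy data): one "root" keypoint per camera per frame.
kp2d = torch.rand(C, T, 2)

# Unknowns, all expressed relative to the ground plane:
root_traj = torch.zeros(T, 3, requires_grad=True)    # body root positions
cam_pos = torch.randn(C, T, 3, requires_grad=True)   # per-frame camera centers

f = 500.0  # assumed focal length in pixels

def project(points, cams):
    """Pinhole projection of 3D points into each (translation-only) camera."""
    rel = points.unsqueeze(0) - cams                  # (C, T, 3)
    depth = rel[..., 2:].clamp(min=1e-3)
    return f * rel[..., :2] / depth

opt = torch.optim.Adam([root_traj, cam_pos], lr=0.01)
for step in range(500):
    opt.zero_grad()
    reproj = ((project(root_traj, cam_pos) - kp2d) ** 2).mean()
    # Motion-prior stand-in: penalize implausible body accelerations, which is
    # what disambiguates body motion from camera motion.
    accel = root_traj[2:] - 2 * root_traj[1:-1] + root_traj[:-2]
    loss = reproj + 10.0 * (accel ** 2).mean()
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.4f}")
```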
For tracking and motion capture (MOCAP) of animals in their natural habitat, a safe and silent aerial platform, such as an airship with onboard cameras, is ideal. However, unlike multirotors, airships are subject to strict motion constraints and are affected by ambient wind. Their orientation and flight direction are also tightly coupled. Therefore, state-of-the-art MPC-based formation control methods for perception tasks are not applicable to a team of airships. In this paper, we address this problem by first exploiting the periodic relationship between an airship's airspeed and its distance to the subject. We use it to derive analytical and numerical solutions that satisfy the MOCAP perception constraints. Based on this, we develop an MPC-based formation controller. We provide a detailed analysis of our solution, including the effects of varying physical parameters such as the angle of attack and the pitch angle. Extensive simulation experiments are presented, comparing results for different formation sizes, different wind conditions, and various subject speeds. A demonstration of our method on a real airship is also included. We have released all source code at https://github.com/robot-perception-group/airship-mpc. A video describing our approach and results can be watched at https://youtu.be/ihs0_vrd_kk.
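As a purely illustrative sketch of the periodic airspeed-distance relationship (not the paper's derivation), the snippet below computes the airspeed an airship would need at each point of a circular ground track around a subject under steady wind; all parameter values are assumed:

```python
# Hypothetical illustration: to hold a circular ground track of radius R
# around a subject in steady wind, the required airspeed varies periodically
# with the airship's position on the orbit, since the air-relative velocity
# is the ground velocity minus the wind.
import math

R = 40.0                 # orbit radius around the subject, m (assumed)
v_ground = 4.0           # desired ground speed along the orbit, m/s (assumed)
wind = (1.5, 0.0)        # steady wind, m/s (assumed)

for deg in range(0, 360, 45):
    theta = math.radians(deg)
    # Ground-velocity direction is tangent to the circle at angle theta.
    vg = (-v_ground * math.sin(theta), v_ground * math.cos(theta))
    # Required air-relative velocity = ground velocity - wind.
    va = (vg[0] - wind[0], vg[1] - wind[1])
    print(f"theta={deg:3d} deg  required airspeed={math.hypot(*va):.2f} m/s")
```

Running this shows the required airspeed oscillating between v_ground - |wind| and v_ground + |wind| over one orbit, the kind of periodicity the controller can exploit.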
In this paper, we present an active visual SLAM (V-SLAM) approach for omnidirectional robots. The goal is to generate control commands that allow such a robot to simultaneously localize itself and map an unknown environment, while maximizing the amount of gained information and consuming as little energy as possible. Leveraging the robot's independent translational and rotational control, we introduce a multi-layered approach for active V-SLAM. The top layer decides on informative goal locations and generates highly informative paths to them. The second and third layers actively re-plan and execute the paths, exploiting the continuously updated map and local feature information. Moreover, we introduce two utility formulations that account for the obstacles in the field of view and the robot's location. Through rigorous simulations, real-robot experiments, and comparisons with state-of-the-art methods, we demonstrate that our approach achieves similar coverage results with a smaller overall map entropy. This is obtained while keeping the traversed distance up to 39% shorter than the other methods and without increasing the total rotation of the wheels. Code and implementation details are provided as open source.
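A hypothetical sketch of the kind of information utility the top layer might compute: scoring candidate goal locations by the Shannon entropy of the surrounding occupancy grid. The grid, window, and scoring rule below are illustrative, not the paper's two utility formulations:

```python
# Illustrative entropy-based utility for choosing informative goal locations
# on an occupancy grid (cell values are P(occupied); 0.5 = fully unknown).
# This is a simplified stand-in, not the paper's utility formulation.
import numpy as np

rng = np.random.default_rng(0)
grid = rng.uniform(0.0, 1.0, size=(50, 50))   # toy occupancy probabilities

def cell_entropy(p):
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def utility(goal, grid, radius=5):
    """Sum of map entropy within a square window around a candidate goal."""
    r0, c0 = goal
    win = grid[max(r0 - radius, 0):r0 + radius + 1,
               max(c0 - radius, 0):c0 + radius + 1]
    return cell_entropy(win).sum()

candidates = [(10, 10), (25, 40), (45, 5)]
best = max(candidates, key=lambda g: utility(g, grid))
print("most informative goal:", best)
```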
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on the quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely monitor the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP) and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related from non-relevant social posts, while LETT is a Named Entity Recognition (NER) task aiming at the extraction of location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
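A minimal sketch of the RCTP-style fine-tuning setup with the Hugging Face transformers API; the data, labels, and hyperparameters below are placeholders rather than the working note's actual configuration:

```python
# Minimal sketch of binary relevance classification of tweets with a
# pre-trained transformer (placeholder data; hyperparameters are assumed,
# not taken from the paper).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # 0 = not relevant, 1 = flood-related

texts = ["River overflowing near the bridge, roads closed",
         "Great pizza downtown tonight!"]
labels = torch.tensor([1, 0])

batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
out = model(**batch, labels=labels)   # one illustrative training step
out.loss.backward()
optim.step()

model.eval()
with torch.no_grad():
    probs = model(**batch).logits.softmax(dim=-1)
print(probs[:, 1])                    # P(flood-related) per tweet
```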
In recent years, social media has been widely explored as a potential source of communication and information in disasters and emergency situations. Several interesting works and case studies of disaster analytics, exploring different aspects of natural disasters, have already been conducted. Along with its great potential, disaster analytics comes with several challenges, mainly due to the nature of social media content. In this paper, we explore one such challenge and propose a text classification framework to deal with noisy Twitter data. More specifically, we employed several transformer models, both individually and in combination, to differentiate between relevant and non-relevant Twitter posts, achieving a highest F1-score of 0.87.
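The "in combination" part could be realized by late fusion of per-model probabilities; the sketch below assumes uniform averaging, since the abstract does not specify the merging rule:

```python
# Hypothetical late-fusion sketch: average class probabilities from several
# fine-tuned transformers to decide tweet relevance. Uniform averaging is an
# assumption; the abstract does not specify how models are combined.
import numpy as np

# P(relevant) for 4 tweets from three hypothetical fine-tuned models.
p_bert    = np.array([0.91, 0.12, 0.55, 0.80])
p_roberta = np.array([0.88, 0.20, 0.47, 0.75])
p_distil  = np.array([0.93, 0.08, 0.60, 0.66])

p_ensemble = np.mean([p_bert, p_roberta, p_distil], axis=0)
predictions = (p_ensemble >= 0.5).astype(int)   # 1 = relevant, 0 = not
print(p_ensemble.round(3), predictions)
```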
Osteoarthritis (OA) is the most prevalent chronic joint disease worldwide, and the knee accounts for more than 80% of commonly affected joints. Knee OA is not yet a curable disease, and it affects large populations of patients, making it costly to patients and healthcare systems. The etiology, diagnosis, and treatment of knee OA are complicated by the variability of its clinical and physical manifestations. Although knee OA carries a list of well-known terminology aiming to standardize the nomenclature of the diagnosis, prognosis, treatment, and clinical outcomes of the chronic joint disease, in practice there is a wide range of terminology associated with knee OA across different data sources, including but not limited to biomedical literature, clinical notes, healthcare literacy, and health-related social media. Among these data sources, the scientific articles published in the biomedical literature usually provide a principled pipeline for studying the disease. Rapid yet accurate text mining on large-scale scientific literature may discover novel knowledge and terminology to better understand knee OA and to improve the quality of knee OA diagnosis, prevention, and treatment. The present work aims to utilize artificial neural network strategies to automatically extract vocabularies associated with knee OA. Our findings indicate the feasibility of developing word-embedding neural networks for autonomous keyword extraction and abstraction of knee OA.
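A small sketch of the word-embedding approach, assuming gensim's Word2Vec on a placeholder corpus; the real pipeline would train on large-scale biomedical abstracts:

```python
# Toy sketch: train word embeddings on (placeholder) biomedical sentences and
# query terms near "osteoarthritis" as candidate knee-OA vocabulary.
from gensim.models import Word2Vec

corpus = [
    ["knee", "osteoarthritis", "is", "a", "chronic", "joint", "disease"],
    ["cartilage", "degeneration", "causes", "knee", "pain", "and", "stiffness"],
    ["radiographic", "osteoarthritis", "severity", "graded", "by",
     "kellgren", "lawrence"],
]

model = Word2Vec(sentences=corpus, vector_size=50, window=5,
                 min_count=1, epochs=50, seed=0)
# Nearest neighbors in embedding space serve as candidate related terms.
for word, score in model.wv.most_similar("osteoarthritis", topn=5):
    print(f"{word}\t{score:.3f}")
```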
Neural models that do not rely on pre-training have excelled in the keyphrase generation task with large annotated datasets. Meanwhile, new approaches have incorporated pre-trained language models (PLMs) for their data efficiency. However, there has been no systematic study of how the two types of approaches compare and how different design choices can affect the performance of PLM-based models. To fill this knowledge gap and facilitate a more informed use of PLMs for keyphrase extraction and keyphrase generation, we present an in-depth empirical study. Formulating keyphrase extraction as sequence labeling and keyphrase generation as sequence-to-sequence generation, we perform extensive experiments in three domains. After showing that PLMs have competitive high-resource performance and state-of-the-art low-resource performance, we investigate important design choices, including in-domain PLMs, PLMs with different pre-training objectives, using PLMs under a parameter budget, and different formulations for present keyphrases. Further results show that (1) in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models; (2) with a fixed parameter budget, prioritizing model depth over width and allocating more layers to the encoder leads to better encoder-decoder models; and (3) introducing four in-domain PLMs, we achieve competitive performance in the news domain and state-of-the-art performance in the scientific domain.
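For the sequence-labeling formulation, a minimal sketch with B/I/O tags on top of a PLM token-classification head; the model choice and data are illustrative, not the paper's exact setup:

```python
# Minimal sketch of keyphrase extraction as sequence labeling: each token gets
# a B/I/O tag and a token-classification head is fine-tuned on top of a PLM.
# Model choice and data are illustrative, not the paper's exact setup.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

tags = ["O", "B-KP", "I-KP"]
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(tags))

words = ["we", "study", "keyphrase", "generation", "with", "plms"]
word_tags = ["O", "O", "B-KP", "I-KP", "O", "O"]

enc = tok(words, is_split_into_words=True, return_tensors="pt")
# Align word-level tags to subword tokens (special tokens get -100 = ignored).
labels = [-100 if i is None else tags.index(word_tags[i])
          for i in enc.word_ids(0)]
out = model(**enc, labels=torch.tensor([labels]))
out.loss.backward()   # one illustrative training step
print(f"loss: {out.loss.item():.3f}")
```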
Privacy policies provide individuals with information about their rights and how their personal information is handled. Natural language understanding (NLU) technologies can support individuals and practitioners in better understanding the privacy practices described in lengthy and complex documents. However, existing efforts that use NLU technologies are limited because they process the language in a way that is exclusive to a single task, focusing on certain privacy practices. To this end, we introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding across various tasks. We also collect a large corpus of privacy policies to enable privacy-policy domain-specific language model pre-training. We demonstrate that domain-specific pre-training offers performance improvements across all tasks. We release the benchmark to encourage future research in this domain.
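Domain-specific pre-training here presumably means continued masked-language-model training on the collected privacy-policy corpus before task fine-tuning; a hedged sketch of that step, with placeholder data and hyperparameters:

```python
# Sketch of continued masked-language-model pre-training on a domain corpus
# (privacy policies), before task fine-tuning. Data, paths, and
# hyperparameters are placeholders; the benchmark's actual recipe may differ.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import Dataset

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

policies = ["We collect your email address to provide the service.",
            "You may opt out of targeted advertising at any time."]
ds = Dataset.from_dict({"text": policies}).map(
    lambda ex: tok(ex["text"], truncation=True, max_length=128),
    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pp-mlm", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to=[]),
    train_dataset=ds,
    # Randomly masks 15% of tokens; the model learns to reconstruct them.
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()
```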
While pre-trained language models (LMs) for code have achieved great success in code completion, they generate code conditioned only on the contents of the current file, i.e., the in-file context, and ignore the rich semantics in other files within the same project, i.e., the cross-file context, a critical source of information that is especially useful in modern modular software development. This oversight constrains code language models' capacity for code completion, leading to unexpected behaviors such as generating hallucinated class member functions or function calls with unexpected arguments. In this work, we develop a cross-file context finder tool, CCFINDER, that effectively locates and retrieves the most relevant cross-file context. We propose CoCoMIC, a framework that incorporates cross-file context to learn the in-file and cross-file context jointly on top of pre-trained code LMs. CoCoMIC successfully improves the existing code LM with a 19.30% relative increase in exact match and a 15.41% relative increase in identifier matching for code completion when the cross-file context is provided.
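A greatly simplified stand-in for what a cross-file context finder does (CCFINDER itself is more sophisticated): index top-level definitions across project files with ast, then retrieve those whose names are referenced in the file being completed:

```python
# Greatly simplified stand-in for a cross-file context finder: index top-level
# function/class definitions across project files, then retrieve definitions
# whose names are referenced in the current file. CCFINDER's actual retrieval
# is more sophisticated; this only sketches the idea.
import ast

project = {
    "utils.py": "def normalize(x):\n    return (x - min(x)) / (max(x) - min(x))\n",
    "models.py": "class Encoder:\n    def forward(self, x):\n        return x\n",
}
current_file = "import utils\ny = utils.normalize(data)\n"

# Build an index: definition name -> (file, source snippet).
index = {}
for path, src in project.items():
    for node in ast.parse(src).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            index[node.name] = (path, ast.get_source_segment(src, node))

# Retrieve cross-file context for attribute names mentioned in the current file.
mentioned = {n.attr for n in ast.walk(ast.parse(current_file))
             if isinstance(n, ast.Attribute)}
for name in mentioned & index.keys():
    path, snippet = index[name]
    print(f"# from {path}\n{snippet}\n")
```

The retrieved snippets would then be prepended to the in-file context before it is fed to the code LM.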